Add in parse tree testing for every grammar, every input file #4169

Closed
wants to merge 25 commits into from

kaby76
Contributor

@kaby76 kaby76 commented Jul 23, 2024

This repo currently tests the output CST of a parse for only a small subset of grammars. With new versions of Antlr coming to the fore (different targets, e.g., Wasm and Rust; rewrites, e.g., Antlr4ng), it's important to make sure these produce identical parse trees, which checks the correctness of the generated parsers, runtimes, and ports. This PR adds parse tree testing for all test inputs, for all grammars.

This fixes two issues:

All .tree files are removed and replaced with .ipt ("Indented Parse Tree") files. These files are human-readable: each node in the parse tree is on its own line, indented in proportion to the node's depth in the CST. The old Maven tester still works, but only for quick sanity checking of grammar changes; it is no longer used for comparing output CSTs. Every input file that is parsed now has a corresponding .ipt file, which must be kept in sync with the grammar and the test file.

The code for formatted parse trees is implemented in the driver code for testing in this repo. Antlr does not have any implementation of formatted parse trees, and likely never will, as it is difficult to get buy-in on any new features.
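To make the "one node per line, indented by depth" idea concrete, here is a minimal Python sketch. It is not the driver code from this PR: the `Node` class, the two-space indent, and the label text are all assumptions for illustration, and the real .ipt layout may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical CST node: a rule name or token text plus its children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def to_indented_lines(node: Node, depth: int = 0, indent: str = "  ") -> List[str]:
    """Return one line per node, indented in proportion to its depth."""
    lines = [indent * depth + node.label]
    for child in node.children:
        lines.extend(to_indented_lines(child, depth + 1, indent))
    return lines


def format_tree(root: Node) -> str:
    """Join the lines with LF endings and finish with a trailing newline."""
    return "\n".join(to_indented_lines(root)) + "\n"


# A tiny made-up tree for the input "1+2".
tree = Node("expr", [Node("expr", [Node("1")]), Node("+"), Node("expr", [Node("2")])])
print(format_tree(tree), end="")
```

Because every node sits on its own line, a grammar change that moves or renames a single node shows up as a small, local change in "git diff". The recursion here is fine for a sketch; a very deep tree would want an explicit stack instead.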

Going forward, people will need to update the .ipt files whenever the grammar changes. They can do this by:

  • running trgen locally;
  • or, running a build and copying and pasting the resulting trees into the .ipt files.

People will need to be careful about the newline characters in both the input files and the .ipt files generated by the parser. The .ipt files must use Unix-style line endings for easy "git diff" comparisons. Any input files for testing should probably be stored with "eol=lf" in the git repo, so you will probably need to update the .gitattributes file.
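As an illustration of why line endings matter here, the following Python sketch compares an expected tree with a freshly generated one after normalizing both to LF. This is a hypothetical helper, not part of the PR's tester; the function names and the `expected.ipt` / `actual` labels are assumptions.

```python
import difflib
from typing import List


def normalize_eol(text: str) -> str:
    """Convert CRLF and lone CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def diff_trees(expected: str, actual: str) -> List[str]:
    """Return unified-diff lines between two trees; empty if they match."""
    exp = normalize_eol(expected).splitlines()
    act = normalize_eol(actual).splitlines()
    return list(
        difflib.unified_diff(
            exp, act, fromfile="expected.ipt", tofile="actual", lineterm=""
        )
    )


# Same tree, different line endings: no difference after normalization.
print(diff_trees("a\r\n  b\r\n", "a\n  b\n"))

# A genuinely different node is still reported.
for line in diff_trees("a\n  b\n", "a\n  c\n"):
    print(line)
```

Without the normalization step, a tree generated on Windows would differ from a checked-in LF file on every line, which hides real differences. Keeping the files as LF in the repo, as described above, avoids relying on a step like this.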

@kaby76
Contributor Author

kaby76 commented Nov 10, 2024

I'm going to close this off because it's low-priority. I'd rather see ambiguity and fallback cleaned up. Improving the speed of parsing will be more impactful and help change the pervasive view that Antlr is "garbage", "slow", etc.
